Query Classification System Based on Snippet Summary Similarities for NTCIR-10 1CLICK-2 Task
Abstract
This paper describes a query classification system for the NTCIR-10 1CLICK-2 task. The system classifies Japanese and English queries into eight predefined classes using support vector machines (SVMs). Feature vectors are built from snippet similarities rather than snippet word frequencies. Because these vectors have fewer dimensions than vectors built from raw words, the SVMs have fewer parameters; the system therefore generalizes better and needs fewer computing resources. Two document similarity measures, cosine similarity and the Jaccard index, were compared, as were two snippet sources: the Bing search results provided by the task organizer and Yahoo! Japan Web search results. Methods that add query string information to the snippet information in the feature vectors were also compared with the above methods. The system achieved an accuracy of 0.89 in the English task using cosine similarity with the Yahoo! Japan Web search results, and 0.86 in the Japanese task using cosine similarity with the Bing search results.
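The two similarity measures named in the abstract, and the idea of a low-dimensional similarity-based feature vector, can be illustrated with a minimal Python sketch. The abstract does not say what the snippets are compared against, so this sketch assumes one reference summary per class and one vector dimension per class. The function names, the whitespace tokenizer, and the two placeholder summaries are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    # Whitespace tokenizer used only for illustration; Japanese text
    # would need a morphological analyzer instead (assumption).
    return text.lower().split()


def cosine_similarity(tokens_a, tokens_b):
    # Cosine of the angle between two term-frequency vectors.
    ca, cb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_index(tokens_a, tokens_b):
    # Size of the intersection of the two token sets over the size of their union.
    sa, sb = set(tokens_a), set(tokens_b)
    union = sa | sb
    if not union:
        return 0.0
    return len(sa & sb) / len(union)


def feature_vector(query_snippets, class_summaries, similarity):
    # One dimension per class summary (assumed design, see lead-in):
    # the similarity between the query's concatenated snippets and that summary.
    query_tokens = tokenize(" ".join(query_snippets))
    return [similarity(query_tokens, tokenize(s)) for s in class_summaries]


if __name__ == "__main__":
    # Hypothetical placeholder summaries standing in for two of the classes.
    summaries = [
        "singer album song band tour music",
        "hotel station address opening hours map",
    ]
    snippets = ["new album and tour dates", "the band released a song"]
    print(feature_vector(snippets, summaries, cosine_similarity))
    print(feature_vector(snippets, summaries, jaccard_index))
```

Under this assumption the vector length equals the number of classes rather than the vocabulary size, which is the dimensionality reduction the abstract credits for the smaller number of SVM parameters. The resulting vectors would then be passed to an SVM classifier, which is omitted here.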
Related resources
A Query Classification System based on Snippet Similarity for a One-Click Search
This paper proposes a query classification system for a one-click search system that uses feature vectors based on snippet similarity. The proposed system targets the NTCIR-10 1CLICK-2 query classification subtask and classifies queries in Japanese and English into eight predefined classes by using support vector machines (SVMs). In the NTCIR-9 and NTCIR-10 tasks, most participants used complex...
Overview of the NTCIR-10 1CLICK-2 Task
This is an overview of the NTCIR-10 1CLICK-2 task (the second One Click Access task). Given a search query, 1CLICK aims to satisfy the user with a single textual output instead of a ranked list of URLs. Systems are expected to present important pieces of information first and to minimize the amount of text the user has to read. We designed English and Japanese 1CLICK tasks, in which 10 research...
TTOKU Summarization Based Systems at NTCIR-10 1CLICK-2 task
We describe our query-oriented summarization system implemented for the NTCIR-10 1CLICK-2 task. Our system is based purely on a summarization method, treating the task as a summarization process. The system calculates relevance scores of terms for a given query, then extracts the relevant parts of sentences from input sources. To calculate relevance scores for a query, we employed a Query Sn...
MSRA at NTCIR-10 1CLICK-2
We describe Microsoft Research Asia’s approaches to the NTCIR-10 1CLICK-2 task. We construct the system based on some heuristic rules, and change the setting of our approaches to test the effectiveness of each setting. The evaluation results show the effectiveness of the query attributes.
Hunter Gatherer: UdeM at 1CLICK-2
We describe our hunter-gatherer system for the NTCIR-10 1CLICK-2 task. We draw inspiration from the DeepQA framework and adapt it to the 1CLICK task. Several techniques can be integrated naturally into this framework. The hunter component generates candidates based on passage retrieval for the original query; the gatherer component collects evidence for each candidate and scores them b...